Bounding Box-Free Instance Segmentation Using Semi-Supervised Learning for Generating a City-Scale Vehicle Dataset
Vehicle classification is a hot computer vision topic, with studies ranging
from ground-view up to top-view imagery. In remote sensing, the usage of
top-view images allows for understanding city patterns, vehicle concentration,
traffic management, and other applications. However, several difficulties arise when
aiming for pixel-wise classification: (a) most vehicle classification studies
use object detection methods, and most publicly available datasets are designed
for this task, (b) creating instance segmentation datasets is laborious, and
(c) traditional instance segmentation methods underperform on this task since
the objects are small. Thus, the present research objectives are: (1) propose a
novel semi-supervised iterative learning approach using GIS software, (2)
propose a box-free instance segmentation approach, and (3) provide a city-scale
vehicle dataset. The iterative learning procedure comprises: (1) label a small
number of vehicles, (2) train on those samples, (3) use the model to classify
the entire image, (4) convert the image prediction into a polygon shapefile,
(5) correct some areas with errors and include them in the training data, and
(6) repeat until results are satisfactory. To separate instances, we considered
vehicle interior and vehicle borders as separate classes, and the deep learning
(DL) model was the U-net with the Efficient-net-B7 backbone. Removing the
borders isolates each vehicle interior, allowing for unique object
identification. To recover the deleted 1-pixel borders, we proposed a simple
method to expand each prediction.
The results show better pixel-wise metrics than the Mask-RCNN (82%
against 67% in IoU). In the per-object analysis, the overall accuracy, precision,
and recall were greater than 90%. This pipeline applies to any remote sensing
target, and is very efficient for segmentation and dataset generation.
Comment: 38 pages, 10 figures, submitted to journal
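The box-free separation step described in this abstract can be illustrated with a minimal sketch. It assumes the model outputs a per-pixel class map with hypothetical ids 0 (background), 1 (vehicle interior), and 2 (vehicle border). The abstract does not specify the expansion method, so the one-step neighbourhood growth below is an assumption rather than the authors' procedure, and the function names are illustrative.

```python
from collections import deque

INTERIOR, BORDER = 1, 2  # hypothetical class ids; 0 is background

def label_interiors(pred):
    """Give each 4-connected group of interior pixels its own instance id."""
    h, w = len(pred), len(pred[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if pred[r][c] != INTERIOR or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one interior
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and pred[ny][nx] == INTERIOR and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count

def expand_into_borders(pred, labels):
    """Assign each border pixel the id of an instance in its 8-neighbourhood.

    If several instances touch the same border pixel, the last one scanned wins.
    """
    h, w = len(pred), len(pred[0])
    out = [row[:] for row in labels]
    for r in range(h):
        for c in range(w):
            if pred[r][c] != BORDER:
                continue
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = r + dy, c + dx
                    if 0 <= ny < h and 0 <= nx < w and labels[ny][nx]:
                        out[r][c] = labels[ny][nx]
    return out

# Two touching vehicles: their borders merge, but their interiors do not.
pred = [
    [2, 2, 2, 2, 2, 2],
    [2, 1, 2, 2, 1, 2],
    [2, 2, 2, 2, 2, 2],
]
labels, count = label_interiors(pred)
instances = expand_into_borders(pred, labels)
```

Because the interiors are labelled before the borders are reattached, two vehicles whose borders touch still receive distinct ids.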
Panoptic Segmentation Meets Remote Sensing
Panoptic segmentation combines instance and semantic predictions, allowing
the detection of "things" and "stuff" simultaneously. Effectively approaching
panoptic segmentation in remotely sensed data can be auspicious in many
challenging problems since it allows continuous mapping and specific target
counting. Several difficulties have prevented the growth of this task in remote
sensing: (a) most algorithms are designed for traditional images, (b) image
labelling must encompass "things" and "stuff" classes, and (c) the annotation
format is complex. Thus, aiming to solve these issues and increase the operability of
panoptic segmentation in remote sensing, this study has five objectives: (1)
create a novel data preparation pipeline for panoptic segmentation, (2) propose
an annotation conversion software to generate panoptic annotations, (3) propose
a novel dataset on urban areas, (4) modify Detectron2 for the task, and (5)
evaluate difficulties of this task in the urban setting. We used an aerial
image with a 0.24-meter spatial resolution considering 14 classes. Our pipeline
considers three image inputs, and the proposed software uses point shapefiles
for creating samples in the COCO format. Our study generated 3,400 samples with
512x512 pixel dimensions. We used the Panoptic-FPN with two backbones
(ResNet-50 and ResNet-101), and the model evaluation considered semantic,
instance, and panoptic metrics. We obtained 93.9, 47.7, and 64.9 for the mean
IoU, box AP, and PQ, respectively. Our study presents the first effective pipeline for
panoptic segmentation and an extensive database for other researchers to use
and deal with other data or related problems requiring a thorough scene
understanding.
Comment: 40 pages, 10 figures, submitted to journal
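For readers unfamiliar with the PQ figure reported in this abstract, the following is a minimal sketch of the standard panoptic quality definition for a single class: the sum of IoUs over matched segment pairs divided by TP + 0.5 FP + 0.5 FN, where a pair matches when its IoU exceeds 0.5. Representing segments as sets of pixel coordinates and the function names are assumptions made for illustration; this is not the evaluation code used in the study.

```python
def iou(a, b):
    """Intersection over union of two segments given as sets of (row, col)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def panoptic_quality(pred_segments, gt_segments):
    """PQ for one class: sum of matched IoUs / (TP + 0.5 * FP + 0.5 * FN)."""
    matched = set()   # indices of predictions already paired
    iou_sum = 0.0
    tp = 0
    for gt in gt_segments:
        for i, pred in enumerate(pred_segments):
            if i in matched:
                continue
            value = iou(gt, pred)
            if value > 0.5:  # IoU > 0.5 makes the match unique
                matched.add(i)
                iou_sum += value
                tp += 1
                break
    fp = len(pred_segments) - tp  # predictions with no ground-truth match
    fn = len(gt_segments) - tp    # ground-truth segments that were missed
    denom = tp + 0.5 * fp + 0.5 * fn
    return iou_sum / denom if denom else 0.0

# One good match (IoU 0.75), one missed object, one spurious prediction.
gt = [{(0, 0), (0, 1), (0, 2), (0, 3)}, {(5, 5)}]
pred = [{(0, 0), (0, 1), (0, 2)}, {(9, 9)}]
pq = panoptic_quality(pred, gt)  # 0.75 / (1 + 0.5 + 0.5) = 0.375
```

In a multi-class evaluation, PQ is computed per class in this way and then averaged across classes.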
Recognising Three-Dimensional Objects using Parameterized Volumetric Models
Centre for Intelligent Systems and their Applications
This thesis addresses the problem of recognizing 3-D objects using shape information extracted from range images and parameterized volumetric models. The domain of geometric shapes explored is that of complex curved objects with articulated parts and a great deal of similarity between some of the parts. These objects are exemplified by animal shapes; however, the general characteristics and complexity of these shapes are present in a wide range of other natural and man-made objects.
In model-based object recognition, three main issues constrain the design of a complete solution: representation, feature extraction, and interpretation. This thesis develops an integrated approach that addresses these three issues in the context of the above-mentioned domain of objects. For representation, I propose a composite description using globally deformable superquadrics and a set of volumetric primitives called geons; this description is shown to have representational and discriminative properties suitable for recognition. Feature extraction comprises a segmentation process that extracts a parts-based description of the objects as assemblies of deformable superquadrics. Discontinuity points detected in the images are linked using an 'active contour' minimization technique, and deformable superquadric models are then fitted to the resulting regions. Interpretation is split into three components: classification of parts, matching, and pose estimation. A Radial Basis Function (RBF) classifier algorithm is presented to classify the superquadric shapes derived from the segmentation into one of twelve geon classes. The matching component is decomposed into two stages. First, an indexing scheme makes effective use of the output of the RBF classifier to direct the search to the models that contain the parts identified. This makes the search more efficient and, with a model library that is organised in a meaningful and robust way, permits growth without compromising performance. Second, a method is proposed in which the hypotheses picked from the index are searched using an Interpretation Tree algorithm, combined with a quality measure based on Possibility Theory, or the Theory of Fuzzy Sets, to evaluate the bindings and the final valid hypotheses. The valid hypotheses ranked by the matching process are then passed to the pose estimation module.
This module uses a Kalman Filter technique that includes the constraints on the articulations as perfect measurements, and as such provides a robust and generic way to estimate pose in object domains such as the one approached here.
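Treating a constraint as a perfect measurement means running the usual Kalman measurement update with zero measurement noise. The sketch below shows this for a single scalar constraint h·x = z on a state held as plain Python lists. The two-variable example, in which two part positions must coincide at a joint, is a hypothetical illustration and is not taken from the thesis.

```python
def constrained_update(x, P, h, z, r=0.0):
    """Kalman update for one scalar measurement z = h . x + noise(var r).

    With r = 0 the measurement is 'perfect', so the updated state
    satisfies the constraint h . x = z exactly.
    """
    n = len(x)
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]  # P h^T
    s = sum(h[i] * Ph[i] for i in range(n)) + r                     # innovation variance
    if s == 0.0:
        raise ValueError("constraint direction has no remaining uncertainty")
    k = [Ph[i] / s for i in range(n)]                               # Kalman gain
    innovation = z - sum(h[i] * x[i] for i in range(n))
    x_new = [x[i] + k[i] * innovation for i in range(n)]
    hP = [sum(h[i] * P[i][j] for i in range(n)) for j in range(n)]  # h P
    P_new = [[P[i][j] - k[i] * hP[j] for j in range(n)] for i in range(n)]
    return x_new, P_new

# Hypothetical joint: the end points of two parts must coincide (x0 - x1 = 0).
x = [0.0, 3.0]
P = [[1.0, 0.0], [0.0, 1.0]]
x_new, P_new = constrained_update(x, P, h=[1.0, -1.0], z=0.0)
# x_new == [1.5, 1.5]: both estimates move equally because their variances are equal.
```

After the update, the covariance has zero variance along the constraint direction, so later noisy measurements cannot pull the estimate off the constraint.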
These techniques are then combined to produce an integrated approach to the object recognition task. The thesis develops such an integrated approach and evaluates its performance in the sample domain. Future extensions of each technique and the overall integration strategy are discussed.
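As background for the volumetric primitives mentioned in this abstract, the sketch below evaluates the standard superquadric inside-outside function in the primitive's own coordinate frame. The parameter names (a1, a2, a3 for the axis lengths, e1, e2 for the shape exponents) follow common usage and are assumptions here. The global deformations used in the thesis, and its fitting procedure, are not covered.

```python
def superquadric_inside_outside(x, y, z, a1, a2, a3, e1, e2):
    """Inside-outside function F of a superquadric in its canonical frame.

    F < 1 inside the surface, F == 1 on it, F > 1 outside.
    e1 = e2 = 1 gives an ellipsoid; exponents near 0 give a box-like shape.
    """
    xy = abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)
    return xy ** (e2 / e1) + abs(z / a3) ** (2.0 / e1)

# The same point lies outside a unit sphere but inside a unit box-like shape.
point = (0.9, 0.9, 0.9)
f_sphere = superquadric_inside_outside(*point, 1, 1, 1, e1=1.0, e2=1.0)
f_box = superquadric_inside_outside(*point, 1, 1, 1, e1=0.1, e2=0.1)
```

Fitting a superquadric to range data typically means minimising a residual built from F over the measured surface points, which is why a closed-form inside-outside test is convenient.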
Instance Segmentation for Governmental Inspection of Small Touristic Infrastructure in Beach Zones Using Multispectral High-Resolution WorldView-3 Imagery
Misappropriation of public lands is an ongoing government concern. In Brazil, the beach zone is public property, but many private establishments use it for economic purposes, requiring constant inspection. Among the undue targets, the individual mapping of straw beach umbrellas (SBUs) attached to the sand is a great challenge due to their small size, high presence, and agglutinated appearance. This study aims to automatically detect and count SBUs on public beaches using high-resolution images and instance segmentation, obtaining pixel-wise semantic information and individual object detection. This study is the first instance segmentation application on coastal areas and the first using WorldView-3 (WV-3) images. We used the Mask-RCNN with some modifications: (a) multispectral input for the WV-3 imagery (eight channels), (b) an improved sliding window algorithm for large image classification, and (c) a comparison of different image resizing ratios to improve small object detection, since the SBUs are small objects (<32² pixels) even in high-resolution images (31 cm). The accuracy analysis used standard COCO metrics considering the original image and three scale ratios (2×, 4×, and 8× resolution increase). The average precision (AP) results increased with the image resolution: 30.49% (original image), 48.24% (2×), 53.45% (4×), and 58.11% (8×). The 8× model presented 94% AP50, classifying nearly all SBUs correctly. Moreover, the improved sliding window approach enables the classification of large areas, providing automatic counting and estimating the size of the objects. It proved effective for inspecting large coastal areas and provides insightful information for public managers. This remote sensing application impacts the inspection cost, tribute, and environmental conditions.
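The abstract does not describe how the sliding window was improved, so the sketch below shows only the generic idea: covering a large image with fixed-size, overlapping tiles so that a model trained on small patches can classify the whole scene. The function name, tile size, and stride are hypothetical.

```python
def sliding_windows(height, width, tile, stride):
    """Return (top, left, bottom, right) tiles that cover the whole image.

    Tiles overlap when stride < tile; a final tile is aligned to the image
    edge so that no pixels are left uncovered.
    """
    def starts(size):
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile + 1, stride))
        if positions[-1] != size - tile:
            positions.append(size - tile)  # snap the last tile to the edge
        return positions

    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in starts(height)
        for left in starts(width)
    ]

# A 1000 x 1000 image with 512-pixel tiles and 50% nominal overlap.
windows = sliding_windows(1000, 1000, tile=512, stride=256)
```

Overlap means an object cut by one tile edge is likely to appear whole in a neighbouring tile. Predictions from overlapping tiles then need to be merged, for example by discarding duplicate detections, before objects are counted.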